Get started with Document AI in the Google Cloud console

Create a Document AI OCR processor and extract text from a PDF

1. Enable Document AI in a Google Cloud project.

2. Create a Document OCR processor, which can identify and extract text from different types of documents.

3. Use the processor to extract text from a sample document.

Create a document OCR processor

Start from the "Processor gallery" in the console

General

This tutorial uses "Document OCR"

The stable version is from 2020 (newer versions are rc)

Specialized

Presumably for specific document types

Once the processor was created, a prediction endpoint had been created as well
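As a rough sketch of what that prediction endpoint is for: the Document AI v1 REST API takes a `process` call with the document base64-encoded in a `rawDocument` field. The snippet below only builds the URL and JSON body; it does not send anything. The helper name `build_process_request` and the project, location, and processor ID values are hypothetical placeholders, and the URL pattern is my understanding of the v1 API, so copy the real endpoint from the console instead of relying on it. Sending the request would also need an OAuth bearer token, which is not shown here.

```python
import base64
import json
from typing import Tuple


def build_process_request(project_id: str, location: str, processor_id: str,
                          pdf_bytes: bytes) -> Tuple[str, str]:
    """Return (url, json_body) for a Document AI v1 `process` REST call.

    Hypothetical helper: nothing is sent over the network.
    """
    # Assumed endpoint pattern; the console shows the actual one per processor.
    url = (
        f"https://{location}-documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"processors/{processor_id}:process"
    )
    # The file content goes inline as base64 together with its MIME type.
    body = {
        "rawDocument": {
            "content": base64.b64encode(pdf_bytes).decode("ascii"),
            "mimeType": "application/pdf",
        }
    }
    return url, json.dumps(body)


# Placeholder values, not real identifiers.
url, body = build_process_request("my-project", "us", "abc123", b"%PDF-1.4 ...")
print(url)
```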

Test processor

The "Processor Details" page of the created processor has a "Test your processor" section

Click "Upload Test Document"

The analysis page is displayed

The result can also be exported as JSON

text

pages

Bounding information
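A minimal sketch of how those three pieces fit together, assuming the exported JSON follows the usual Document shape: `text` holds the full string, and each layout under `pages` points back into it through `textAnchor.textSegments`, next to a `boundingPoly`. The `sample` dict is hand-written for illustration, not a real export, and the helper names are hypothetical. In JSON the indices may arrive as strings and `startIndex` may be missing when it is 0, so both cases are handled.

```python
def layout_text(layout: dict, full_text: str) -> str:
    """Slice the text a layout refers to out of the document-level `text`."""
    segments = layout.get("textAnchor", {}).get("textSegments", [])
    return "".join(
        full_text[int(seg.get("startIndex", 0)):int(seg.get("endIndex", 0))]
        for seg in segments
    )


def paragraphs_with_boxes(doc: dict) -> list:
    """Return (page_number, text, normalized_vertices) for each paragraph."""
    out = []
    for page in doc.get("pages", []):
        for para in page.get("paragraphs", []):
            layout = para.get("layout", {})
            vertices = layout.get("boundingPoly", {}).get("normalizedVertices", [])
            out.append((page.get("pageNumber"),
                        layout_text(layout, doc.get("text", "")),
                        vertices))
    return out


# Hand-written stand-in for an exported file (assumed shape, made-up values).
sample = {
    "text": "Hello\nWorld\n",
    "pages": [{
        "pageNumber": 1,
        "paragraphs": [
            {"layout": {
                "textAnchor": {"textSegments": [{"endIndex": "6"}]},
                "boundingPoly": {"normalizedVertices": [
                    {"x": 0.1, "y": 0.1}, {"x": 0.5, "y": 0.1},
                    {"x": 0.5, "y": 0.15}, {"x": 0.1, "y": 0.15}]}}},
            {"layout": {
                "textAnchor": {"textSegments": [
                    {"startIndex": "6", "endIndex": "12"}]},
                "boundingPoly": {"normalizedVertices": []}}},
        ],
    }],
}

for page_number, text, vertices in paragraphs_with_boxes(sample):
    print(page_number, repr(text), len(vertices))
```

With a real export, load the file with `json.load` and pass the resulting dict in place of `sample`.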

The export does not seem to support Japanese (even though the text is shown correctly on screen)

Judging from "Edit OCR config", all of the options appear to be turned off